Unpivot
The Unpivot node is used to convert cross tabulated grids (often referred to as 'crosstabs' or matrix grids) into a tabular format (generally referred to as a table or list). Cross tabulation is used to organize and aggregate raw categorical data into a grid format, which is useful for comparing and analyzing data. This type of grid is particularly common when working in Excel, where users can easily create cross tabulated grids and pivot tables.
How to Unpivot a Table
Here, we have a spreadsheet containing a matrix grid.
Step 1. Connect the Unpivot Node
When the table is configured in the ETL, we see in the Preview panel that the table's structure is not logical. This is because the headers and values in the cross tab grid are being displayed in a tabular format, without having been restructured.
It's clear that the table, if left as is, cannot be queried. In order to make the table useable, the following changes must be made:
- The manufacturer headers (red highlight below) need to be reorganized into a tabular column.
- The 2 measures, net profit and expenses (yellow highlight), need to be converted into 2 tabular columns.
- 'Column 1' (green highlight) should be organized into a tabular column called 'dateKey'; the first row in the column should be assigned as the column's header.
- 'Column 2' (blue highlight) should be organized into a tabular column called 'Product Category'; the first row in the column should be assigned as the column's header.
In order to convert the cross tab grid into a tabular format, the Unpivot node must be added. Find the Unpivot node in the Preparations tab:
Connect the Unpivot node to the table node containing the matrix grid, and with the Unpivot node selected, click Settings from the Properties panel:
Step 2. Configure the Unpivot Settings
You will be presented with the Unpivot Settings dialog, where you must select which rows and columns should be converted into columns:
Convert the First Row to Column Names
If the first row in the crosstab grid contains column headers (as is the case in this example), select 'Column Names as First Row' (orange highlight below). This will move the values in the first row to the column headers.
Convert Categorical Data to Tabular Columns
In this example, we have 2 columns consisting of row headers in the crosstab format: these are dateKey and Product Category, which are the 2 left-most columns which are selected and shaded in blue in the image below. These both need to be converted to columns, as does the row containing manufacturer headers. To convert this categorical data into tabular columns, select the corresponding checkbox (blue arrows below).
Click the Preview button (green arrow) to preview the results.
In the results preview, we see three tabular columns for dateKey, Product Category, and Manufacturer (orange highlight). We also see a Value column containing the Expenses and Net Profit headers; these values must be converted into columns.
Convert Values to Tabular Columns
To convert the values into columns, find the value headers and click the corresponding measure icon (orange arrow below) to select the row or column. Click the preview button again (green arrow) to see the updated preview with the value column(s) (red highlight).
Change Column Names and Hide Columns
From the Flat Results Preview, click a column's edit icon (blue arrow below) to edit the column name, or click the column's hide icon (yellow arrow) to hide the column.
When a column is hidden, the 'Show hidden columns' button is enabled (green highlight). Click this button shows columns that are hidden, but does not un-hide them:
To un-hide a column, click 'Show hidden columns' and then click the 'Show' icon (red arrow) belonging to the hidden column:
Step 3. Apply Changes
Be sure to click the 'Apply' button to confirm your unpivot settings.
Step 4. Preview the Unpivoted Table
Preview the Unpivot node in the data flow to see the tabular structure of the unpivoted table: